Dataset statistics
| Number of variables | 22 |
|---|---|
| Number of observations | 301666 |
| Missing cells | 154566 |
| Missing cells (%) | 2.3% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 42.6 MiB |
| Average record size in memory | 148.0 B |
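The percentage and per-record figures in this table follow from the raw counts. The sketch below redoes that arithmetic. It assumes the missing-cell percentage is taken over all cells (observations × variables) and that 1 MiB is 1,048,576 bytes. The variable names are illustrative, not part of the report.

```python
# Figures copied from the dataset statistics table.
n_variables = 22
n_observations = 301_666
missing_cells = 154_566
total_size_mib = 42.6

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 2.3%
print(f"{avg_record_bytes:.0f} B")  # 148 B
```

The per-record result agrees with the reported 148.0 B only up to the rounding of the total size to one decimal place.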
Variable types
| CAT | 10 |
|---|---|
| NUM | 8 |
| BOOL | 4 |
| Latitude has 77283 (25.6%) missing values | Missing |
|---|---|
| Longitude has 77283 (25.6%) missing values | Missing |
| Unnamed: 0 has unique values | Unique |
| day_of_week has 39355 (13.0%) zeros | Zeros |
| hour has 23282 (7.7%) zeros | Zeros |
| minute has 39858 (13.2%) zeros | Zeros |
Reproduction
| Analysis started | 2021-02-05 14:53:30.157689 |
|---|---|
| Analysis finished | 2021-02-05 14:54:07.357186 |
| Duration | 37.2 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
Unnamed: 0
Real number (ℝ≥0)
| Distinct | 301666 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 338338.5259 |
|---|---|
| Minimum | 0 |
| Maximum | 660610 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 31470.25 |
| Q1 | 167073.25 |
| median | 347579.5 |
| Q3 | 516833.75 |
| 95-th percentile | 632514.75 |
| Maximum | 660610 |
| Range | 660610 |
| Interquartile range (IQR) | 349760.5 |
Descriptive statistics
| Standard deviation | 195434.7634 |
|---|---|
| Coefficient of variation (CV) | 0.5776308296 |
| Kurtosis | -1.21808208 |
| Mean | 338338.5259 |
| Median Absolute Deviation (MAD) | 174880.5 |
| Skewness | -0.06097040379 |
| Sum | 1.020652297e+11 |
| Variance | 3.819474673e+10 |
| Monotonicity | Strictly increasing |
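Several of the descriptive statistics are simple functions of one another. The sketch below computes a few of them from raw values. It assumes the sample standard deviation (n − 1 denominator), linearly interpolated quartiles, and MAD defined as the median of absolute deviations from the median. The helper name `describe` is hypothetical and is not the report's own implementation.

```python
import statistics

def describe(values):
    """Return a few of the descriptive statistics listed in the table."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    median = statistics.median(values)
    # "inclusive" interpolates linearly between the observed data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean,
        "std": std,
        "variance": std ** 2,
        "cv": std / mean,  # coefficient of variation
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "mad": statistics.median([abs(v - median) for v in values]),
    }

stats = describe([2, 4, 4, 4, 5, 5, 7, 9])
print(stats["mean"], stats["range"], stats["iqr"], stats["mad"])
```

For the column above, 195434.7634 / 338338.5259 ≈ 0.5776 and 516833.75 − 167073.25 = 349760.5, which match the reported CV and IQR.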
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 455563 | 1 | < 0.1% | |
| 68468 | 1 | < 0.1% | |
| 66421 | 1 | < 0.1% | |
| 72566 | 1 | < 0.1% | |
| 70519 | 1 | < 0.1% | |
| 626820 | 1 | < 0.1% | |
| 621434 | 1 | < 0.1% | |
| 619387 | 1 | < 0.1% | |
| 84860 | 1 | < 0.1% | |
| Other values (301656) | 301656 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 660610 | 1 | < 0.1% | |
| 660609 | 1 | < 0.1% | |
| 660608 | 1 | < 0.1% | |
| 660607 | 1 | < 0.1% | |
| 660606 | 1 | < 0.1% |
Type
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| person search | 230152 | 76.3% | |
| person and vehicle search | 71401 | 23.7% | |
| vehicle search | 113 | < 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 25 |
|---|---|
| Median length | 13 |
| Mean length | 15.84064164 |
| Min length | 13 |
Part of a policing operation
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| missing | 148496 | 49.2% | |
| False | 137070 | 45.4% | |
| True | 16100 | 5.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 7 |
|---|---|
| Median length | 5 |
| Mean length | 5.931135759 |
| Min length | 4 |
Latitude
Real number (ℝ≥0)
| Distinct | 78337 |
|---|---|
| Distinct (%) | 34.9% |
| Missing | 77283 |
| Missing (%) | 25.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 52.51750653 |
|---|---|
| Minimum | 49.892149 |
| Maximum | 57.143856 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 49.892149 |
|---|---|
| 5-th percentile | 50.8307412 |
| Q1 | 51.4989 |
| median | 52.617842 |
| Q3 | 53.424474 |
| 95-th percentile | 54.548465 |
| Maximum | 57.143856 |
| Range | 7.251707 |
| Interquartile range (IQR) | 1.925574 |
Descriptive statistics
| Standard deviation | 1.131710504 |
|---|---|
| Coefficient of variation (CV) | 0.02154920481 |
| Kurtosis | -0.9270221553 |
| Mean | 52.51750653 |
| Median Absolute Deviation (MAD) | 0.896118 |
| Skewness | 0.1884210566 |
| Sum | 11784035.67 |
| Variance | 1.280768665 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 51.402747 | 1243 | 0.4% | |
| 53.403914 | 502 | 0.2% | |
| 51.541264 | 336 | 0.1% | |
| 51.62752 | 335 | 0.1% | |
| 53.404563 | 298 | 0.1% | |
| 53.477512 | 287 | 0.1% | |
| 51.407665 | 270 | 0.1% | |
| 53.407939 | 259 | 0.1% | |
| 53.402122 | 235 | 0.1% | |
| 53.046403 | 223 | 0.1% | |
| Other values (78327) | 220395 | 73.1% | |
| (Missing) | 77283 | 25.6% |
| Value | Count | Frequency (%) | |
| 49.892149 | 1 | < 0.1% | |
| 49.922299 | 1 | < 0.1% | |
| 49.952464 | 1 | < 0.1% | |
| 49.959358 | 1 | < 0.1% | |
| 50.081765 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 57.143856 | 3 | < 0.1% | |
| 56.457531 | 1 | < 0.1% | |
| 56.393853 | 2 | < 0.1% | |
| 55.9846 | 1 | < 0.1% | |
| 55.952496 | 9 | < 0.1% |
Longitude
Real number (ℝ)
| Distinct | 78711 |
|---|---|
| Distinct (%) | 35.1% |
| Missing | 77283 |
| Missing (%) | 25.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -1.339926363 |
|---|---|
| Minimum | -8.053397 |
| Maximum | 1.75648 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | -8.053397 |
|---|---|
| 5-th percentile | -3.028236 |
| Q1 | -2.6047225 |
| median | -1.45732 |
| Q3 | -0.204165 |
| 95-th percentile | 0.937354 |
| Maximum | 1.75648 |
| Range | 9.809877 |
| Interquartile range (IQR) | 2.4005575 |
Descriptive statistics
| Standard deviation | 1.368559346 |
|---|---|
| Coefficient of variation (CV) | -1.021369072 |
| Kurtosis | -0.6463855179 |
| Mean | -1.339926363 |
| Median Absolute Deviation (MAD) | 1.206935 |
| Skewness | 0.1067613875 |
| Sum | -300656.6971 |
| Variance | 1.872954683 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| -0.509813 | 1243 | 0.4% | |
| -2.981499 | 502 | 0.2% | |
| -0.003338 | 336 | 0.1% | |
| -0.749093 | 335 | 0.1% | |
| -2.982401 | 297 | 0.1% | |
| -2.226586 | 286 | 0.1% | |
| -0.512615 | 270 | 0.1% | |
| -2.977364 | 259 | 0.1% | |
| -2.980826 | 235 | 0.1% | |
| -2.194767 | 223 | 0.1% | |
| Other values (78701) | 220397 | 73.1% | |
| (Missing) | 77283 | 25.6% |
| Value | Count | Frequency (%) | |
| -8.053397 | 14 | < 0.1% | |
| -8.011806 | 1 | < 0.1% | |
| -7.98653 | 5 | < 0.1% | |
| -7.980785 | 11 | < 0.1% | |
| -7.97146 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1.75648 | 1 | < 0.1% | |
| 1.75643 | 2 | < 0.1% | |
| 1.75617 | 1 | < 0.1% | |
| 1.756072 | 5 | < 0.1% | |
| 1.755927 | 1 | < 0.1% |
Gender
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| male | 270765 | 89.8% | |
| female | 30636 | 10.2% | |
| other | 265 | 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.203990506 |
| Min length | 4 |
Age range
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| 18-24 | 102645 | 34.0% | |
| 25-34 | 73119 | 24.2% | |
| over 34 | 66063 | 21.9% | |
| 10-17 | 59534 | 19.7% | |
| under 10 | 305 | 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 8 |
|---|---|
| Median length | 5 |
| Mean length | 5.441020864 |
| Min length | 5 |
Officer-defined ethnicity
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| white | 238002 | 78.9% | |
| black | 31776 | 10.5% | |
| asian | 24233 | 8.0% | |
| other | 5839 | 1.9% | |
| mixed | 1816 | 0.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Legislation
Categorical
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| misuse of drugs act 1971 (section 23) | 176838 | 58.6% | |
| police and criminal evidence act 1984 (section 1) | 91822 | 30.4% | |
| missing | 27420 | 9.1% | |
| criminal justice and public order act 1994 (section 60) | 2736 | 0.9% | |
| firearms act 1968 (section 47) | 1849 | 0.6% | |
| criminal justice act 1988 (section 139b) | 680 | 0.2% | |
| poaching prevention act 1862 (section 2) | 148 | < 0.1% | |
| psychoactive substances act 2016 (s36(2)) | 90 | < 0.1% | |
| wildlife and countryside act 1981 (section 19) | 31 | < 0.1% | |
| police and criminal evidence act 1984 (section 6) | 15 | < 0.1% | |
| Other values (8) | 37 | < 0.1% |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Length
| Max length | 55 |
|---|---|
| Median length | 37 |
| Mean length | 38.05770289 |
| Min length | 7 |
Object of search
Categorical
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| controlled drugs | 189975 | 63.0% | |
| offensive weapons | 35356 | 11.7% | |
| article for use in theft | 30030 | 10.0% | |
| stolen goods | 26088 | 8.6% | |
| articles for use in criminal damage | 6476 | 2.1% | |
| anything to threaten or harm anyone | 5204 | 1.7% | |
| firearms | 2944 | 1.0% | |
| evidence of offences under the act | 1907 | 0.6% | |
| fireworks | 1710 | 0.6% | |
| psychoactive substances | 1701 | 0.6% | |
| Other values (6) | 275 | 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 42 |
|---|---|
| Median length | 16 |
| Mean length | 17.35245271 |
| Min length | 8 |
Outcome linked to object of search
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 294.6 KiB |
| Value | Count | Frequency (%) | |
| False | 204070 | 67.6% | |
| True | 97596 | 32.4% |
Removal of more than just outer clothing
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 294.6 KiB |
| Value | Count | Frequency (%) | |
| False | 291332 | 96.6% | |
| True | 10334 | 3.4% |
station
Categorical
| Distinct | 41 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| merseyside | 40864 | 13.5% | |
| essex | 18813 | 6.2% | |
| thames-valley | 17647 | 5.8% | |
| west-yorkshire | 16548 | 5.5% | |
| hampshire | 13357 | 4.4% | |
| south-yorkshire | 13131 | 4.4% | |
| hertfordshire | 12928 | 4.3% | |
| kent | 12878 | 4.3% | |
| surrey | 10635 | 3.5% | |
| avon-and-somerset | 9700 | 3.2% | |
| Other values (31) | 135165 | 44.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 18 |
|---|---|
| Median length | 10 |
| Mean length | 10.61584004 |
| Min length | 3 |
target
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 294.6 KiB |
| Value | Count | Frequency (%) | |
| False | 241105 | 79.9% | |
| True | 60561 | 20.1% |
Outcome_true
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 294.6 KiB |
| Value | Count | Frequency (%) | |
| False | 211580 | 70.1% | |
| True | 90086 | 29.9% |
ethnicity
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| white | 212777 | 70.5% | |
| missing | 40705 | 13.5% | |
| black | 20362 | 6.7% | |
| asian | 17947 | 5.9% | |
| mixed | 8039 | 2.7% | |
| other | 1836 | 0.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 7 |
|---|---|
| Median length | 5 |
| Mean length | 5.269868 |
| Min length | 5 |
year
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Value | Count | Frequency (%) | |
| 2019 | 184070 | 61.0% | |
| 2018 | 117596 | 39.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
month
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.844344407 |
|---|---|
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 7 |
| Q3 | 10 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.437626734 |
|---|---|
| Coefficient of variation (CV) | 0.5022580001 |
| Kurtosis | -1.20347789 |
| Mean | 6.844344407 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.1367373238 |
| Sum | 2064706 |
| Variance | 11.81727756 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 10 | 30902 | 10.2% | |
| 11 | 29759 | 9.9% | |
| 12 | 26984 | 8.9% | |
| 8 | 26267 | 8.7% | |
| 9 | 25824 | 8.6% | |
| 5 | 24476 | 8.1% | |
| 7 | 24077 | 8.0% | |
| 6 | 23965 | 7.9% | |
| 4 | 23914 | 7.9% | |
| 3 | 22987 | 7.6% | |
| Other values (2) | 42511 | 14.1% |
| Value | Count | Frequency (%) | |
| 1 | 22371 | 7.4% | |
| 2 | 20140 | 6.7% | |
| 3 | 22987 | 7.6% | |
| 4 | 23914 | 7.9% | |
| 5 | 24476 | 8.1% |
| Value | Count | Frequency (%) | |
| 12 | 26984 | 8.9% | |
| 11 | 29759 | 9.9% | |
| 10 | 30902 | 10.2% | |
| 9 | 25824 | 8.6% | |
| 8 | 26267 | 8.7% |
day
Real number (ℝ≥0)
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.77049452 |
|---|---|
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 16 |
| Q3 | 23 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.741345795 |
|---|---|
| Coefficient of variation (CV) | 0.554284825 |
| Kurtosis | -1.175055221 |
| Mean | 15.77049452 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.003070489734 |
| Sum | 4757422 |
| Variance | 76.41112631 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 13 | 10402 | 3.4% | |
| 20 | 10346 | 3.4% | |
| 9 | 10207 | 3.4% | |
| 18 | 10194 | 3.4% | |
| 23 | 10150 | 3.4% | |
| 21 | 10127 | 3.4% | |
| 17 | 10098 | 3.3% | |
| 12 | 10094 | 3.3% | |
| 19 | 10085 | 3.3% | |
| 22 | 10041 | 3.3% | |
| Other values (21) | 199922 | 66.3% |
| Value | Count | Frequency (%) | |
| 1 | 9678 | 3.2% | |
| 2 | 9381 | 3.1% | |
| 3 | 9792 | 3.2% | |
| 4 | 9538 | 3.2% | |
| 5 | 9951 | 3.3% |
| Value | Count | Frequency (%) | |
| 31 | 6137 | 2.0% | |
| 30 | 8837 | 2.9% | |
| 29 | 8923 | 3.0% | |
| 28 | 9757 | 3.2% | |
| 27 | 9529 | 3.2% |
day_of_week
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.064080805 |
|---|---|
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 39355 |
| Zeros (%) | 13.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 1.954546439 |
|---|---|
| Coefficient of variation (CV) | 0.6378899787 |
| Kurtosis | -1.206874938 |
| Mean | 3.064080805 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.0713532846 |
| Sum | 924329 |
| Variance | 3.820251782 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 5 | 48029 | 15.9% | |
| 4 | 47851 | 15.9% | |
| 3 | 43598 | 14.5% | |
| 2 | 42743 | 14.2% | |
| 1 | 40808 | 13.5% | |
| 0 | 39355 | 13.0% | |
| 6 | 39282 | 13.0% |
| Value | Count | Frequency (%) | |
| 0 | 39355 | 13.0% | |
| 1 | 40808 | 13.5% | |
| 2 | 42743 | 14.2% | |
| 3 | 43598 | 14.5% | |
| 4 | 47851 | 15.9% |
| Value | Count | Frequency (%) | |
| 6 | 39282 | 13.0% | |
| 5 | 48029 | 15.9% | |
| 4 | 47851 | 15.9% | |
| 3 | 43598 | 14.5% | |
| 2 | 42743 | 14.2% |
hour
Real number (ℝ≥0)
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.53988517 |
|---|---|
| Minimum | 0 |
| Maximum | 23 |
| Zeros | 23282 |
| Zeros (%) | 7.7% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 9 |
| median | 15 |
| Q3 | 20 |
| 95-th percentile | 23 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 11 |
Descriptive statistics
| Standard deviation | 7.385004255 |
|---|---|
| Coefficient of variation (CV) | 0.5454259147 |
| Kurtosis | -0.89783832 |
| Mean | 13.53988517 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | -0.5640549754 |
| Sum | 4084523 |
| Variance | 54.53828784 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 23 | 27105 | 9.0% | |
| 0 | 23282 | 7.7% | |
| 22 | 17712 | 5.9% | |
| 21 | 16910 | 5.6% | |
| 20 | 16889 | 5.6% | |
| 19 | 16714 | 5.5% | |
| 15 | 16440 | 5.4% | |
| 16 | 16129 | 5.3% | |
| 14 | 16019 | 5.3% | |
| 18 | 15926 | 5.3% | |
| Other values (14) | 118540 | 39.3% |
| Value | Count | Frequency (%) | |
| 0 | 23282 | 7.7% | |
| 1 | 14413 | 4.8% | |
| 2 | 9887 | 3.3% | |
| 3 | 7036 | 2.3% | |
| 4 | 4291 | 1.4% |
| Value | Count | Frequency (%) | |
| 23 | 27105 | 9.0% | |
| 22 | 17712 | 5.9% | |
| 21 | 16910 | 5.6% | |
| 20 | 16889 | 5.6% | |
| 19 | 16714 | 5.5% |
minute
Real number (ℝ≥0)
| Distinct | 60 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.81117527 |
|---|---|
| Minimum | 0 |
| Maximum | 59 |
| Zeros | 39858 |
| Zeros (%) | 13.2% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 5 |
| median | 22 |
| Q3 | 40 |
| 95-th percentile | 55 |
| Maximum | 59 |
| Range | 59 |
| Interquartile range (IQR) | 35 |
Descriptive statistics
| Standard deviation | 18.4997296 |
|---|---|
| Coefficient of variation (CV) | 0.7769347538 |
| Kurtosis | -1.256784146 |
| Mean | 23.81117527 |
| Median Absolute Deviation (MAD) | 17 |
| Skewness | 0.2071741977 |
| Sum | 7183022 |
| Variance | 342.2399952 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 39858 | 13.2% | |
| 1 | 21476 | 7.1% | |
| 30 | 21473 | 7.1% | |
| 45 | 12853 | 4.3% | |
| 15 | 12805 | 4.2% | |
| 20 | 12564 | 4.2% | |
| 50 | 11909 | 3.9% | |
| 10 | 11464 | 3.8% | |
| 40 | 11374 | 3.8% | |
| 5 | 7982 | 2.6% | |
| Other values (50) | 137908 | 45.7% |
| Value | Count | Frequency (%) | |
| 0 | 39858 | 13.2% | |
| 1 | 21476 | 7.1% | |
| 2 | 2730 | 0.9% | |
| 3 | 2609 | 0.9% | |
| 4 | 2601 | 0.9% |
| Value | Count | Frequency (%) | |
| 59 | 2155 | 0.7% | |
| 58 | 2423 | 0.8% | |
| 57 | 2252 | 0.7% | |
| 56 | 2289 | 0.8% | |
| 55 | 7648 | 2.5% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
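The calculation described above can be sketched in plain Python. This is an illustrative helper (the name `pearson_r` is hypothetical), not the code the report itself runs. The 1/n factors of the covariance and of the standard deviations cancel, so sums of deviations are used directly.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shifting and (positively) rescaling either variable leaves r unchanged.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(round(r, 6), round(r_transformed, 6))  # 0.8 0.8
```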
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
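A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. It assumes tied values receive the average of the ranks they span, which is a common convention. The function names are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# A nonlinear but strictly increasing relationship gives a perfect rank correlation.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
print(round(rho, 6))  # 1.0
```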
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
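The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition. Tied pairs count as neither concordant nor discordant, and libraries often report the tie-adjusted tau-b instead, so results may differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    pairs = list(combinations(zip(x, y), 2))
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant, one discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))  # 0.6667
```

The quadratic loop over all pairs is fine for illustration but would be slow on a column with 301666 observations.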
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | Unnamed: 0 | Type | Part of a policing operation | Latitude | Longitude | Gender | Age range | Officer-defined ethnicity | Legislation | Object of search | Outcome linked to object of search | Removal of more than just outer clothing | station | target | Outcome_true | ethnicity | year | month | day | day_of_week | hour | minute |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | person search | True | NaN | NaN | male | 18-24 | asian | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | asian | 2019 | 12 | 1 | 6 | 0 | 0 |
| 1 | 1 | person search | True | NaN | NaN | male | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | missing | 2019 | 12 | 1 | 6 | 0 | 9 |
| 2 | 2 | person search | True | NaN | NaN | female | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | white | 2019 | 12 | 1 | 6 | 0 | 10 |
| 3 | 3 | person search | False | NaN | NaN | male | 18-24 | asian | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | missing | 2019 | 12 | 1 | 6 | 0 | 10 |
| 4 | 4 | person search | True | 50.368247 | -4.126646 | male | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | missing | 2019 | 12 | 1 | 6 | 0 | 12 |
| 5 | 5 | person search | True | NaN | NaN | male | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | white | 2019 | 12 | 1 | 6 | 0 | 13 |
| 6 | 6 | person search | True | NaN | NaN | male | 25-34 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | white | 2019 | 12 | 1 | 6 | 0 | 16 |
| 7 | 7 | person search | True | NaN | NaN | male | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | white | 2019 | 12 | 1 | 6 | 0 | 25 |
| 8 | 8 | person search | True | NaN | NaN | male | 18-24 | black | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | black | 2019 | 12 | 1 | 6 | 0 | 25 |
| 9 | 9 | person search | True | NaN | NaN | male | 25-34 | black | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | devon-and-cornwall | False | False | black | 2019 | 12 | 1 | 6 | 0 | 35 |
Last rows
| | Unnamed: 0 | Type | Part of a policing operation | Latitude | Longitude | Gender | Age range | Officer-defined ethnicity | Legislation | Object of search | Outcome linked to object of search | Removal of more than just outer clothing | station | target | Outcome_true | ethnicity | year | month | day | day_of_week | hour | minute |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 301656 | 660601 | person and vehicle search | False | 51.156811 | -1.859133 | female | 25-34 | white | misuse of drugs act 1971 (section 23) | controlled drugs | True | False | wiltshire | True | True | white | 2018 | 8 | 25 | 5 | 19 | 8 |
| 301657 | 660602 | person search | False | 51.446286 | -2.013650 | male | 18-24 | white | police and criminal evidence act 1984 (section 1) | offensive weapons | False | False | wiltshire | False | False | white | 2018 | 8 | 26 | 6 | 10 | 0 |
| 301658 | 660603 | person search | False | 51.720270 | -1.953499 | female | 10-17 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | wiltshire | False | False | missing | 2018 | 8 | 26 | 6 | 22 | 30 |
| 301659 | 660604 | person search | missing | NaN | NaN | male | 25-34 | white | police and criminal evidence act 1984 (section 1) | article for use in theft | False | False | wiltshire | False | False | white | 2018 | 8 | 27 | 0 | 19 | 50 |
| 301660 | 660605 | person search | missing | NaN | NaN | female | 25-34 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | wiltshire | False | True | white | 2018 | 8 | 28 | 1 | 22 | 25 |
| 301661 | 660606 | person search | missing | NaN | NaN | male | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | wiltshire | False | False | white | 2018 | 8 | 29 | 2 | 2 | 45 |
| 301662 | 660607 | person and vehicle search | False | 51.540219 | -1.764708 | male | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | True | False | wiltshire | True | True | white | 2018 | 8 | 29 | 2 | 21 | 0 |
| 301663 | 660608 | person search | False | 51.540219 | -1.764708 | male | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | wiltshire | False | False | white | 2018 | 8 | 29 | 2 | 21 | 10 |
| 301664 | 660609 | person search | False | 51.540219 | -1.764708 | male | 18-24 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | wiltshire | False | False | white | 2018 | 8 | 29 | 2 | 21 | 15 |
| 301665 | 660610 | person search | missing | NaN | NaN | female | over 34 | white | misuse of drugs act 1971 (section 23) | controlled drugs | False | False | wiltshire | False | True | white | 2018 | 8 | 30 | 3 | 13 | 15 |